Skip to content

Conversation

@iluise
Copy link
Collaborator

@iluise iluise commented Sep 19, 2025

Description

Add reader to retrieve CSV scores from files generated with Quaver. @Jubeku

quaver scores can be plotted in the FastEvaluation package by adding this in the config (you need to have them locally first):

  nhem_son2022_24h_ifs_oper_an:
    type: "csv"
    label: "IFS Quaver ERA5"
    csv_path: "./scores/scores_nhem_son2022_24h_ifs_oper_an.csv" 
    metrics_dir: "./scores/"
    metric: "rmse"
    region: "nhem"
    streams: 
      ERA5:
        channels: ["2t", "10ff", "q_850", "t_850", "z_500"]
        evaluation: 
          forecast_step: "all"
          sample: "all"

Issue Number

Closes #930

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@iluise iluise self-assigned this Sep 19, 2025
@iluise iluise added the eval anything related to the model evaluation pipeline label Sep 19, 2025
@clessig clessig requested a review from tjhunter September 19, 2025 20:39
Copy link
Collaborator

@tjhunter tjhunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iluise I have not tried it but I trust you did. Approved, and here are a couple of comments (small)

self.csv_path = eval_cfg.get("csv_path")
assert self.csv_path is not None, "CSV path must be provided in the config."

self.data = pd.read_csv(self.csv_path, index_col=0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing I would do is cast all the values to np.float32 (or float). pandas tries to be very clever and would for example use int32 if the data allows. I am not sure if xarray can deal with that later.

Copy link
Contributor

@grassesi grassesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you have a object that can be an instance of different subclasses and you want different behavior depending on the subclass, you should use polymorphism to implement it: Make a method with the same function signature in both subclasses and implement the different behavior there. The advantage is that the caller does not have to have any knowledge of the differnces (As you do here).

@github-project-automation github-project-automation bot moved this to In Progress in WeatherGen-dev Oct 29, 2025
@iluise
Copy link
Collaborator Author

iluise commented Nov 5, 2025

Hi @grassesi thanks for the suggestion, it looks much better now. I've updated the code and tested it. Can you check if this is what you expected and eventually approve the PR?

Copy link
Contributor

@grassesi grassesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thats exactly what I meant 👍

@grassesi grassesi merged commit 4f4097c into develop Nov 6, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in WeatherGen-dev Nov 6, 2025
TillHae pushed a commit to TillHae/WeatherGenerator that referenced this pull request Nov 13, 2025
* first version of quaver reader

* working version

* add CSVReader

* rebase to develop

* add polimorphism

* fix names

* lint
MatKbauer pushed a commit that referenced this pull request Nov 17, 2025
* training progress unit realignment from epoch to mini_epoch

* small naming fix of mini_epoch

* ruffed

* linted

* Fix out of bounds in data_reader_obs (#1180)

* fix out of bounds access

* Adding comment

* Removed debgu

* Fixed to use forward function for forecast engine (#1188)

* Fixed to use forward function for forecast engine, and also fstep for conditioning

* Fixed missing return statement

* Enable FesomDataReader to have different source and target datasets (#1046)

* Implement separate target and source files, adjust masking

* Fix casual masking

* Fix longitude conversion flag

* Fix casual masking strategy

---------

Co-authored-by: Seb Hickman <[email protected]>

* Add support for constant learning rate (#1186)

* Added support for constant learning rate and minor clean-up in code

* Fixed issues with overlap between lr phases

* Changing default lr to constant

* [issue 1123] restore probabilistic scores (#1128)

* rebase

* add ensemble

* fix deterministic

* fix plotting

* lint

* fix eval_config

* probabilistic scores working now

* lint

* Fix spoofing and refactor handling of multiple source files (#1118)

* Cleaning up spoofing and related code on data preprocessing for model

* Fixed typo

* Updated comments

* Removed merge cells and implemented necessary adjustments

* Fixed forecasting

* Fixed missing handling of NaNs in coordinates and channel data

* Minor clean up

* Fix to removing/renaming variables

* Changed funtion name to improve readability

* Fixed bug with incorrect handling of multiple input datasources.

* Addressed reviewer comments

* resolve conflict

* [1131] fixes circular dependencies (#1134)

* fixes dependencies

* cleanup

* make the type checker not fail

* cleanup

* cleanup of type issues

* Give option to plot only prediction maps (#1139)

* add plot_preds_only feature

* minor changes after comments

* Tell FSDP2 about embedding engine forward functions (#1133)

* Tell FSDP2 about embedding engine forward functions

Note DO NOT add print functions in forward functions of the model, it
will break with FSDP2

* Add comment

* recover 'all' option (#1146)

* Fixed problem in inferecne (#1145)

* implement vrmse (#1147)

* [1144] Extra fixes (#1148)

* Fixed problem in inferecne

* more fixes

* fixes

* lint

* lint

---------

Co-authored-by: Christian Lessig <[email protected]>

* Jk/log grad norms/log grad norms (#1068)

* Log gradient norms

* Prototype for recording grad norms

* Address review changes + hide behind feature flag

* Final fixes including backward compatibility

* Ruff

* More ruff stuff

* forecast config with small decoder

* fixed uv.lock

* test gradient logging on mutli gpus

* update uv.lock to latest develop version

* revert to default confit

* add comment on FSDP2 specifics

* move plot grad script to private repo

* rm seaborn from pyproject

* updating terminal and metrics loggin, add get_tensor_item fct

* check for DTensor instead of world size

* revert forecast fct, fix in separate PR

* rename grad_norm log names to exclude from MLFlow

* add log_grad_norms to default config

---------

Co-authored-by: sophiex <[email protected]>

* Add forecast and observation activity (#1126)

* Add calculation methods for forecast and observation activity metrics in Scores class

* Add new calculation methods for forecast activity metrics in Scores class

* ruff

* fix func name

* Rename observation activity calculation method to target activity in Scores class

* typo

* refactor to common calc_act function for activity

* fix cases

* have calc_tact and calc_fact that use _calc_act for maintainability

* fix small thing in style

---------

Co-authored-by: iluise <[email protected]>

* hotfix: use correct methot `create` instead of `construct` (#1090)

* restore develop

* fix deterministic

* fix plotting

* lint

* fix eval_config

* probabilistic scores working now

* lint

* update utils

* packages/evaluate/src/weathergen/evaluate/score.py

* lint

* removing duplication

---------

Co-authored-by: Christian Lessig <[email protected]>
Co-authored-by: Timothy Hunter <[email protected]>
Co-authored-by: Savvas Melidonis <[email protected]>
Co-authored-by: Sophie X <[email protected]>
Co-authored-by: Julius Polz <[email protected]>
Co-authored-by: Julian Kuehnert <[email protected]>
Co-authored-by: Simon Grasse <[email protected]>

* Adding config to issue templates

The issue template seems to have disappeared, attempting to solve that.

* Add the duration of animation as global plotting option (#1189)

* Add the animation duration as global plotting option

* Linting

* Use FPS instead of milliseconds

* Linting

* Attempt to fix the bug report template

* Attempt to fix initiative template

* Update task template

* [1081][Evaluation] Use parent ruff rules (#1177)

* use ruff settings from parent

* fix code checks

* check fixes 2nd round

* reformat to line length

* [1092] Adds pushing metrics to the evaluation pipeline (#1127)

* changes

* changes

* changes

* changes

* changes

* scores successfully pushed to MLFlow, still need to refactor

* try to batch upload all metrics form same runid

* batch logging all scores of each run_id

* get parent_run by from_run_id

* changes

* cleanups

* bug fixes

* typing issue

* Cleanup

* pdb

* integration test

---------

Co-authored-by: Jubeku <[email protected]>

* Fix the issue - "Empty source still have embedding network" (#1114)

* Replace cf.rank==0 with utils.distributed.is_root

* fix empty source inputs still have embedding layer

* fix lint

* fix source empty or source exclude all

* fix source empty or source exclude all

* fix forecast mode empty source

---------

Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>

* [930][evaluation] implement CSVReader (#932)

* first version of quaver reader

* working version

* add CSVReader

* rebase to develop

* add polimorphism

* fix names

* lint

* Iluise/hot fixes (#1209)

* fix froct

* fix 1150

* Fix plot_train verbosity (#1225)

* [1206] Experimentation for extra data readers (#1207)

* initial implementation

* changes

* toml

* add module to annotations.json (#1142)

Co-authored-by: Javad Kasravi <[email protected]>

* Correct bug with score cards and bar plots for different metrics (#1192)

* Rebase to develop

* Linting

* Address comments and linting

* [eval][1122] Plot scores on a map (#1176)

* first version of score maps

* add maps to compute_scores

* fix single sample situation

* fix single sample

* lint

* restore score.py

* fix bug in metric stream

* default flag to false

* Minor correction, a line was deleted by mistake? (#1193)

* fix

* working setup for regridded data

* fix missing valid time case

* lint and fix color in score cards

* fix path for score maps

* Allow plotting score maps every time

---------

Co-authored-by: Savvas Melidonis <[email protected]>

* Fix DDP without FSDP (#1227)

* Fix DDP without FSDP

* Fixed taht freezing would not have worked with only DDP

* refactor export scripts - Part 1 (#1223)

* move stuff around

* move stuff around

* rename files

* [1034][reader_extra] E-Obs datareader (#1228)

* [1034] rebase

* [1034] add dataloader

* [1034] Zarr3-->Zarr2

* [1034] lint

* [1034] lint

* [1034] Moved to reader_extra

* [1034] registry E-Obs

* training progress unit realignment from epoch to mini_epoch

* ruffed

* check if path is dir in io_reader

* fix overwrite of fname_zarr in io_reader

* add backward compatibility to config read

* Separate write and read functions for model*chkpt*.json files (MatKBauer)

---------

Co-authored-by: Christian Lessig <[email protected]>
Co-authored-by: Kacper Nowak <[email protected]>
Co-authored-by: Seb Hickman <[email protected]>
Co-authored-by: iluise <[email protected]>
Co-authored-by: Timothy Hunter <[email protected]>
Co-authored-by: Savvas Melidonis <[email protected]>
Co-authored-by: Sophie X <[email protected]>
Co-authored-by: Julius Polz <[email protected]>
Co-authored-by: Julian Kuehnert <[email protected]>
Co-authored-by: Simon Grasse <[email protected]>
Co-authored-by: Michael Tarnawa <[email protected]>
Co-authored-by: Jubeku <[email protected]>
Co-authored-by: Jifeng Wang <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: wang85 <[email protected]>
Co-authored-by: Javad kasravi <[email protected]>
Co-authored-by: Javad Kasravi <[email protected]>
Co-authored-by: Simone Norberti <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

eval anything related to the model evaluation pipeline

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

implement CSVReader to read quaver scores in FastEvaluation package

4 participants